Temporal Lift Pooling for Continuous Sign Language Recognition
Authors
Abstract
Pooling methods are necessities for modern neural networks, increasing receptive fields and lowering computational costs. However, commonly used hand-crafted pooling approaches, e.g., max pooling and average pooling, may not preserve discriminative features well. While many researchers have elaborately designed various pooling variants in the spatial domain to handle these limitations, with much progress, the temporal aspect is rarely visited, and directly applying hand-crafted methods or these specialized spatial variants may not be optimal. In this paper, we derive temporal lift pooling (TLP) from the Lifting Scheme in signal processing to intelligently downsample features of different temporal hierarchies. The Lifting Scheme factorizes input signals into various sub-bands with different frequency, which can be viewed as different temporal movement patterns. Our TLP is a three-stage procedure, which performs signal decomposition, component weighting and information fusion to generate a refined downsized feature map. We select a typical temporal task with long sequences, i.e. continuous sign language recognition (CSLR), as our testbed to verify the effectiveness of TLP. Experiments on two large-scale datasets show that TLP outperforms hand-crafted methods and specialized spatial variants by a large margin (1.5%) with similar computational overhead. As a robust feature extractor, TLP exhibits great generalizability upon multiple backbones on various datasets and achieves new state-of-the-art results on two large-scale CSLR datasets. Visualizations further demonstrate the mechanism of TLP in correcting gloss borders. Code is released (https://github.com/hulianyuyy/Temporal-Lift-Pooling).
Keywords: Lifting scheme; Continuous sign language recognition; Temporal lift pooling
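To make the three stages named in the abstract (decomposition, component weighting, fusion) more concrete, the following is a minimal sketch of a classic Haar-style lifting step applied to a one-dimensional sequence. It is not the authors' TLP: the function names `lifting_decompose` and `lift_pool` and the fixed weights `w_approx` and `w_detail` are hypothetical, and the predict and update operators are fixed arithmetic rather than the learned components a neural pooling layer would presumably use.

```python
# Illustrative Haar-style lifting step; NOT the paper's learned TLP layer.
# All names and default weights here are assumptions made for this sketch.
from typing import List, Tuple


def lifting_decompose(x: List[float]) -> Tuple[List[float], List[float]]:
    """Split, predict, update. Returns (approximation, detail), each half length."""
    if len(x) % 2 != 0:
        raise ValueError("sequence length must be even")
    # Split: separate even-indexed and odd-indexed time steps.
    even, odd = x[0::2], x[1::2]
    # Predict: estimate each odd sample from its even neighbour, keep the residual
    # (the high-frequency sub-band).
    detail = [o - e for e, o in zip(even, odd)]
    # Update: correct the even samples so the result equals the pairwise mean
    # (the low-frequency sub-band).
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def lift_pool(x: List[float], w_approx: float = 1.0, w_detail: float = 0.5) -> List[float]:
    """Weight the two sub-bands and fuse them into one half-length sequence."""
    approx, detail = lifting_decompose(x)
    return [w_approx * a + w_detail * d for a, d in zip(approx, detail)]


if __name__ == "__main__":
    frames = [1.0, 3.0, 2.0, 2.0, 5.0, 1.0]
    print(lifting_decompose(frames))
    print(lift_pool(frames))
```

With `w_detail=0.0` this sketch reduces to average pooling with stride 2, which illustrates how a weighted combination of sub-bands can generalize a hand-crafted pooling rule.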
Similar resources
Video Analysis for Continuous Sign Language Recognition
The recognition of continuous, natural signing is very challenging due to the multimodal nature of the visual cues (fingers, lips, facial expressions, body pose, etc.), as well as technical limitations such as spatial and temporal resolution and unreliable depth cues. On the other hand, signing gestures are designed to be robustly discernible. We therefore argue in favor of an integrative appro...
Deep Sign: Hybrid CNN-HMM for Continuous Sign Language Recognition
This paper introduces the end-to-end embedding of a CNN into an HMM, while interpreting the outputs of the CNN in a Bayesian fashion. The hybrid CNN-HMM combines strong discriminative abilities of CNNs with sequence modeling capabilities of HMMs. Most current approaches in the field of gesture and sign language recognition disregard the necessity of dealing with sequence data both for training a...
Sign Language Recognition Using Temporal Classification
In the US alone, there are approximately 900,000 hearing impaired people whose primary mode of conversation is sign language. For these people, communication with non-signers is a daily struggle, and they are often disadvantaged when it comes to finding a job, accessing health care, etc. There are a few emerging technologies aimed at overcoming these communication barriers, but most existing so...
A Real-Time Continuous Gesture Recognition System for Sign Language
In this paper, a large-vocabulary sign language interpreter is presented with real-time continuous gesture recognition of sign language using a DataGlove. The most critical problem, end-point detection in a stream of gesture input, is solved first, and then statistical analysis is done according to four parameters in a gesture: posture, position, orientation, and motion. We have implemented a proto...
Appearance-Based Features for Automatic Continuous Sign Language Recognition
This diploma thesis investigates appearance-based features for the person-independent vision-based recognition of continuous sign language. A large variety of methods which have been successfully used for automatic speech recognition is applied to this task. Appearance-based approaches do not rely on a segmentation of the images or on predefined models of the image content and use the image its...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-19833-5_30